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Abstract — This paper deals with the estimation of a sequence 
of frequencies from a corresponding sequence of signals. This 
problem arises in fields such as Doppler imaging where its speci- 
ficity is twofold. First, only short noisy data records are available 
(typically four sample long) and experimental constraints may 
cause spectral aliasing so that measurements provide unreli- 
able, ambiguous information. Second, the frequency sequence 
is smooth. Here, this information is accounted for by a Markov 
model and application of the Bayes rule yields the a posteriori 
density. The maximum a posteriori is computed by a combination 
of Viterbi and descent procedures. One of the major features 
of the method is that it is entirely unsupervised. Adjusting 
the hyperparameters that balance data-based and prior-based 
information is done automatically by ML using an EM-based 
gradient algorithm. We compared the proposed estimate to a 
reference one and found that it performed better: variance 
was greatly reduced and tracking was correct, even beyond the 
Nyquist frequency. 

Index Terms — Frequency tracking, aliasing inversion, reg- 
ularization, Bayesian statistic, maximum a posteriori, Viterbi 
algorithm, hyperparameter estimation, maximum likelihood, EM 
algorithm, Forward-Backward procedure, ultrasonic Doppler 
velocimetry, meteorological Doppler radar. 



I. Introduction 

FREQUENCY TRACKING (or mean frequency tracking) 
is currently of interest [1-6], especially in fields such as 
the ultrasonic characterization of biological tissues, synthetic 
aperture radar, and speech processing. Our main interest is 
its use in Doppler imaging (radars [7], ultrasound blood flow 
mapping [8-10]). There are two main features in this area. 

1) One is that only short noisy data records are available 
(typically four sample long) and they are in a vecto- 
rial form. Moreover, the constraints on the sampling 
frequency may cause spectral aliasing, so that measure- 
ments provide small amounts of ambiguous information. 

2) The second is that there is information on the smooth- 
ness of the sought frequency sequence. This a priori 
information is the foundation of the proposed construc- 
tion. It allows robust tracking, even beyond the Nyquist 
limit. 

The most popular methods used for spectral characterization 
rely on periodogram and empirical correlations. The mean fre- 
quency is usually estimated by computing the mean frequency 
of the periodogram [8] over the standardized frequency range 
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v G (—0.5, +0.5]. Another popular estimate is proportional to 
the phase of the first empirical correlation lag [11, 12]. It is 
also provided by a first-order autoregression in a least squares 
framework [13]. But better accuracy is obtained by using 
all the available estimated correlation lags in a Taylor series 
expansion of the correlation function [12, 14]. The resulting 
estimate is also the mean frequency of the periodogram. 
However, the estimated parameters vary greatly, particularly 
when short data records are used. Moreover, the estimated 
frequency approaches zero when the true frequency becomes 
near the Nyquist frequency i/ ~ ±0.5, (due to the periodogram 
1 -periodicity) [8]. To reduce this bias, [15] uses the maximum 
of the periodogram instead of its mean (and yields a maximum 
likelihood estimate, see Section ITlI- Al and [16, p. 410]), and [8] 
iteratively shifts the frequency of the data. This results in 
greater variance so that no frequency tracking remains possible 
beyond v = ±0.5. 

Thus, all the current methods have two drawbacks. First, 
the tracking problem is tackled by a (necessary sub-optimal) 
two-step procedure: 

1) estimate frequencies in the aliased band (—0.5, +0.5]; 

2) detect and inverse aliasing. 

Second, they are clearly based upon empirical second-order 
statistics that perform poorly with short data records inde- 
pendently processed. Unfortunately, the inverse aliasing in 
step 2 often fails due to the great variations in the estimated 
aliased frequencies of step 1. This is usually compensated 
for by post-smoothing the aliased frequency sequence. This 
provides spatial continuity but affects the aliased frequency 
discontinuities, so limiting the capacity to detect aliasing. The 
proposed method copes with the great variation and aliasing 
in a single step; it models the whole data set (by noisy cisoids) 
and the smoothness of the frequency sequence (by a Markov 
random walk) in the regularization/Bayesian framework. It 
then becomes possible to smooth frequency sequence and 
invert aliasing at the same time, so avoiding the pitfalls of 
chaining these operations. 

We have found few papers [3, 17, 18] that adopt such a 
framework, and this study provides four additional features. 

1) First, it deals with vectorial data records as they occur 
in Doppler imaging (see Section |ll]i. 

2) Second, it enables tracking beyond the Nyquist fre- 
quency, whereas others have not investigated this prob- 
lem. 

3) Third, exact frequency likelihood functions are com- 
puted whereas [17] uses a detection step and [3] uses 
an approximation. 
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4) Lastly, the tracking method is entirely unsupervised, 
with a maximum likelihood hyperparameter estimation. 
This is not a straightforward task in the context of 
frequency tracking, since the non-linear character of the 
data as functions of frequencies prevents the explicit 
handling of the likelihood function of the hyperparame- 
ter. We have developed an EM-like gradient procedure, 
inspired by [19-21]. It can be derived only after dis- 
cretizing the frequencies on a finite grid. 
The paper is organized as follows. The notation, signal 
model and assumptions are defined in Section [III Section |III] 
contains the proposed regularized method and Section |IV] 
gives a discrete approximation. Section [V] is devoted to the 
estimation of hyperparameters. The performance of the pro- 
posed method is demonstrated by the computer simulations 
in Section IVII while Section IVIII gives our conclusion and 
describes possible extensions. 

II. Statement, notations and assumptions 

In Doppler imaging, the signals to be analyzed occur as a set 
of complex signals y = [yi, . . . , i/t] juxtaposed spatially, in T 
range bins [22,23]. The data record yt — [yt(l), . • • , yt{N)Y, 
("t" denotes the matrix transpose) is extracted from a cisoid 
in additive complex noise. The amplitude and the frequency 
of the cisoid are at E (C and G R: 



yt = atz{iyt) + bt = at [1, 



i27ri.t(Af-l)it 



(1) 



The vectors u — [j^i, . . . , i/t]* and a ~ [ai, . . . , arY collect 
the frequencies and corresponding amplitudes. Finally, the true 
parameters are denoted with a star. This paper builds a robust 
estimate P for v* on the basis of data set y (see Fig. [T]for a 
simulated example). 




Depth t 

Fig. 1. Simulated observations over T = 128 range bins witli N = A 
samples per bin. From top to bottom: real parts, imaginary parts of the data 
yt, and the true frequency sequence i/^ . 



Remark 1 — Model (|7} is frequently used for spectral prob- 
lems; it has three main features. First, while it is linear w.r.t. 
at, it is not so w.r.t. Vf' the problem to be solved is non- 
linear Second, z{vt) is a 1-periodic function w.r.t. Vt and 



this causes the difficulties of aliasing, frequency ambiguity, 
likelihood periodicity, etc. . . Lastly, this periodicity is also the 
keystone of the paper: aliasing is inverted using a coherent 
statistical approach that takes periodicity in consideration. 

The following definition of periodicity is used throughout 
the paper. 

Definition 1 — Let A C R"^ and : A ^ H. Let us note 
1 = [1, . . . , 1]* e R^. Lp is said: 

• separately- 1-periodic (SIP) ifVu e A Vfe G 'ZF (such 
that u + k E A): ip{u) — ip{u + k). 

. globally- 1-periodic (GIP) ifiu G R^, Vfco G 7L (such 
that u + fcol G A): if{u) = (p{u + fcol). 

The proposed estimation method deals with periodicity and 
aliasing inversion thanks to the following assumptions. They 
are stated for the sake of simplicity and calculation tractability 
as well as coherence with the appUcations under the scope of 
this paper. 

« Parameter dependence. 

- Hi. a, ly and the bt are independent 

• Law for measurement and modeling noise bt- 

- H|: each bt is J\f{rblN) 

- Hj: the sequence of bt is itself white 

• Law for parameters a and v. 

- H|: a is Af{ralT), i-e., white 

- Hg! 1/ is, on the contrary, correlated: N{R^) 
where N{R) stands for a complex zero-mean Gaussian vector 
with covariance R, and Ip,P E N* denotes the PxP identity 
matrix. 

The first assumption Hi is quite natural since no information 
is available about the relative fluctuations of noise and objects. 
The assumptions Hf , and H2 are also natural since no corre- 
lation structure is expected in noise. Similarly, we have no 
information about the variation of the amplitude sequence, so 
an independent law is used. A Gaussian law is preferred (Hg) 
to make the calculations tractable. Contrarily, the smoothness 
of the frequency sequence is modeled as a positive correlation. 
A Markovian structure (specified below) is a simple, useful 
way to account for it. Several choices are available, but the 
Gaussian one is also stated for the sake of simplicity (H3). 

III. Proposed method 

A. Likelihood 

Assumption H| yields a parametric structure for each like- 
lihood function /(j/f | z^t, 04): 



fiVt I i^t,at) = (Trrfc) exp 



-—CLL(vt, at) 

n 



involving the opposite of the logarithm of the likelihood 
function (up to constant terms) i.e., the Co-Log-Likelihood 
(CLL): 

CLL{vt, at) = [yt - atz{vt)]^ [yt - atz(vt)] ■ 

From a deterministic standpoint, CLL{iyt,o,t) is clearly the 
Least Squares (LS) estimation criterion. 



GIOVANNELLI et al: FREQUENCY TRACKING BEYOND THE NYQUIST FREQUENCY USING MARKOV CHAINS 



3 



Considering the whole frequency vector v and the whole 
data set y, assumption H2 yields: 



f{y\u,a) - (Trrb)-^^^ exp 



1 



CLL{iy,a) 



(2) 



where the global CLL is a global LS criterion: 

T 

CLL(iy, a) = ^ CLL{vt,at) . 
t=i 

Remark 2 — According to Definition\l\ the likelihood func- 
tion CLL{ ■ , a) is SIP for all a e <C'^. So, two configurations 
u and v + k (k E TlF ) for the frequency sequence are equi- 
likelihood. As a consequence, an ML approach suffers from T 
independent frequency ambiguities. 

B. Amplitude law and marginalization 

The parameters of interest are the frequencies, while the 
amplitudes are nuisance parameters. These are integrated out 
of the problem in the usual Bayesian approach. 

Given separability assumption Hi one has f{i/,a) = 
f{u)f{a) and the marginal law can easily be deduced: 

f{y,v)^f{u) I f{y\a,u)f{a)da^ f{v)f{y\u). 

J a 

The joint law for the amplitudes is separable according 
to assumption H3. Since likelihood (|2]l is also separable, 
marginalization can be performed independently. 

T T 

f{y\t^)^\{j fiVt I 'yt,at)f{at)dat. = I ^*)- 

1=1-' "-t t=i 

(3) 

The Gaussian amplitude assumption H| results in analytic 
derivations and yield the marginal likelihood for the data yt 
given vt- a zero mean Gaussian vector. Its covariance Rt is 
given in Appendix |B] as well as its determinant (l23T l and its 
inverse ( |24] |. f{yt \ vt) then reads: 



f{Vt I ft) = /3 cxp [--it] exp [aPt{vt) 



with a = Nra/{n{Nra + n)), /3 



/{Nra 



(4) 

n). 



It 



yjyt/ri,, and Pt is the periodogram of vector yt 



1 



N 

E 

n=l 



The joint law for the whole data set given the frequency 
sequence is obtained by the product Q: 



f{y I iv) = exp [-7] exp [-aCLML{iy)] 



(5) 



where 7 is the sum of the jt for t G = {1, . . . , T} and 
where CLML is the Co-Log-Marginal-Likelihood 



CLML{v) 



(6) 



the opposite of the sum of the periodograms of data yt at 
frequency vt, in gate t. 



Remark 3 — This remark is the marginal counterpart of 
Remark^ As well as CLL{ ■ , a), CLML{ ■ ) is SIP: there are 
still many ambiguities as in the non-marginal case. This was 
expected since no information about the frequency sequence 
has been accounted for in CLML{i') w.r.t. CLL{i',a). In 
contrast, periodicity will be eliminated in the next subsection 
by accounting for the frequency sequence smoothness. 

C. Prior law for frequency sequence 

Unlike amplitudes, the frequency sequence is smooth. A 
Markovian structure accurately accounts for this information, 
and there are many algorithms suited to computing this struc- 
ture. The choice of the family law is not crucial for using these 
algorithms, but we have used the Gaussian family: 



f{vt+i I i^t) = (27rr,)-i/2 



exp 



The complete law for the chain also involves the initial state. 
It is assumed to be uniformly distributed over a symmetric set 
K defined by if e N*: K = [-K/2;+K/2]. So f{vi) = 
[l / K)l°j^{vi) where 1^^ is 1 in K and outside. 
The recursive conditioning rule immediately yields: 



/H^(2vr,v)-(^-i)/2exp 
where CLP{v) is the Co-Log-Prior: 



T-1 



CLP{u) = Kl^{v,) + Y.i 



vt+i - Vt) 



(7) 



(8) 



K = 2r^\ogK and 1^ is 1 in K and +00 outside. In the 
deterministic framework CLP{i/) is a quadratic norm for the 
first-order differences, namely a regularization term [24-26]. 

D. Posterior law 

Fusion of prior-hased and data-based information is 
achieved by the Bayes rule, which provides the a posteriori 
density for i/: 

f{u\y)^}{y\v)}{u)/f(y). 

The marginal law f{y) for the whole data set 3^ is not 
analytically tractable, essentially due to the non-linearity of 
the periodogram w.rt. vt and the correlated structure of 
u. Fortunately, this p.d.f. does not depend on v, so the a 
posteriori density remains explicit up to a positive constant. 
Prior structure of Eq. (|7ll8]l and likelihood structure of Eq. ^ 
|6]l immediately yield the posterior law: 



f{u I y) cx exp [~aCLPL{v)] 



(9) 



where the Co-Log-Posterior-Likelihood function (CLPL) 
reads: 



t=i 



t=i 



(10) 

with A = l/2ar^, up to irrelevant constants. In the determin- 
istic framework, CLPL is a Regularized Least Squares (RLS) 
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criterion. It has three terms, one measures fidelity to the data, 
the second measures fidelity to the prior smoothness and the 
third enforces the first frequency ux e K. The regularization 
parameter A (depending on hyperparameters r = [ra,rb,ry]) 
balances the compromise between pnor-based and data-based 
information. 

E. Point estimate 

As a point estimate, a popular choice is the Maximum A 
Posteriori (MAP) i.e., the maximizer of the posterior law of 
Eq. (|9j or the minimizer of the RLS criterion ( fTOl i: 

pMAP _ argmax/(i' | 3^) — argmin CiPL(i') . (11) 





-5 -4 -3 -2 -1 1 2 3 4 5 

-1 750 I : : : : : : : : : , 




Frequency v 



Fig. 2. Typical form of criteria. From top to bottom: CLML{u) (peri- 
odic), C'LP{w) (quadratic) and CLPL{v) as a function of ut (t = 50). 
Regularization breaks periodicity. 



F. Optimization stage 

The proposed approach allows ambiguous periodicity to 
be removed at the expense of accepting local minima in 
the built energy ( fTOl i. A gradient procedure [27] can achieve 
local minimization of ( fTOb and CLPL gradient involves the 
periodograms derivatives 

N-l 
n=l-N 

when rewriting Ptiyt) as a function of empirical correlation 
lags Ct(n) of the signal yt. It is also possible to calculate the 
second-order derivative 

P;'(i/t) = -47r2 n^Zt{n)e^^''''''^ 

n=l-N 

and to implement second-order descent algorithms. 

There are several ways of coping with global optimization, 
e.g., graduated non-convexity [28,29], stochastic algorithms 
such as simulated annealing [30, 31]. We have used a dynamic 
programming procedure for computational simplicity. It is 
based on a discrete approximation of the prior law for the 
frequencies. This approximation allows global optimization 
(on an arbitrary fine discrete frequency grid) and provides a 
convenient framework for estimating hyperparameters. 

IV. Discrete state Markov chain 
This section is devoted to a discrete approximation to 

1) maximize posterior law for the frequency sequence u 

2) build an ML procedure for estimating hyperparameters. 
We have therefore introduced an equally spaced discretization 
of the frequency range [v^] vm] in P states , . . . {vm = 
— i^in = 2.5 and P = 128 in our simulations). 



Remark 4 — This remark is the posterior counterpart of 
Remarks |2] and \3\ Whereas CLL and CLML are SIP, CLPL 
is not: regularization breaks periodicities, favors solutions 
according to prior probabilities, and enables some ambigu- 
ities to be removed. Nevertheless, a global indetermination 
remains: CLPL is a GIP function. This is essentially due 
to (i) the marginal likelihood CLML is a SIP function and 
(ii) the regularization term CLP is a GIP function (since 
it only involves frequency differences). As a consequence, 
two frequency profiles, different from a constant integer level 
remain equi-likelihood. Finally, the latter indeterminacy can 
be removed by choosing an appropriate K: K = \ enforces 
the first frequency vi to remain in (—0.5, +0.5] and the 
corresponding CLPL is no longer GIP. 



A. Probabilities 

Discretization and normalization of the a priori law ([7]) 
yields the state transition probabilities: 

Ft(p,g) = Pr h+i = i/P I = 1/9] 

exp(-K-^^)V2r.) ^^^^ 

Note that P( does not depend upon t, i.e., the proposed chain is 
homogeneous = F. The full state model also includes the 
initial probabilities p(p), chosen constant over (—0.5, +0.5] 
(see Remark nil. 

The marginal (w.r.t. amplitudes) likelihood function for the 
observation sequence given by Eq. ^ yields the observation 
probability distribution (Dt(p) = f{yt \ vt = v^). 



Proposition 1 — With the previous notations and definitions, 
the MAP estimate is such that: 

|PMAP_^MAP| ^ for teN^_i (12) 
Proof: See appendix iB] ■ 



B. Available algorithms 

The Markov chain is now convenient for using algorithms 
given in [32, 33], the Viterbi and the Forward-Backward algo- 
rithms. They enable us to compute 

. the MAP and 

• the hyperparameters likelihood as well as its gradient. 
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1} The Viterbi algorithm: The Viterbi algorithm. Ap- 
pendix |E] has been implemented to cope with global opti- 
mization (on a discrete grid), and performs a step-by-step 
optimization of the posterior law. The required observation 
probabilities are also readily pre-computable by FFT. 

2) Forward - Backward algorithm: We have used a nor- 
malized version of the procedure, as recommended in [34, 35] 
to avoid computational problems. It is founded on forward and 
backward probabilities: 



and^.(,)^^'[^-^"^* 



where yf 



yt'] denotes the partial observation 



"t - [yt,- 

matrix from time t to t'. 

The (count-up) Forward algorithm, given in Appendix [F] 
computes non normalized probabilities J-t{p), normalization 

coefficients Aft and the Ttip) themselves. As a result, the 
observation likelihood can be deduced: 

p 

PT[y] = l[Mt. (14) 

i = l 

It is useful for estimating ML hyperparameters in Section [V] 
The (count-down) Backward step, described in Appendix iGl 
yields marginal a posteriori probabilities (see [32, p. 10]): 

Ptip) - Pr [lyt ^ly^y]^ Mp)l3tip) (15) 

and double marginal a posteriori probabilities (see [32, p. 11]) 

pt{p,q) = Pr\vt-x^v\vt^v-P\y\ 

= MtTt-x{p)Bt{q)V{p,q)^M), (16) 

both needed to calculate the likelihood gradient. 

V. Estimating Hyperparameters 

The MAP estimate of Eq. ( fTTT i depends upon a unique 
regularization parameter A, function of three hyperparameters 
f = [''a, f'b, This section is devoted to their estimation 
using the available data set y. 

Estimating hyperparameters within the regularization frame- 
work is generally a delicate problem. It has been extensively 
studied, several techniques have been proposed and compared 
[36-41] and the preferred strategy is founded on ML. 

The ML estimation consists of {i) expressing the Hyper- 
parameter Likelihood (HL) as HLy{r) — f{y) and 
maximizing the resulting function. Although we have chosen a 
simple Gaussian law, v cannot be marginalized in closed form 
because v enters f{y\v) in a complex manner Fortunately, 
the discrete state approximation of Section |IV] provides a 
satisfactory solution to this problem. It also allows us to devise 
several kinds of algorithms for local maximization of the likeli- 
hood. One such scheme is the acknowledged EM (Expectation- 
Maximization) algorithm, although its application reveals un- 
easy in the present context of a parametric model of hidden 
Markov chain ([19] provides a meaningful discussion of such 
situations, see also [20,21]). Subsection IV-BI is devoted to 
the EM framework, within which a gradient procedure is 
proposed. Subsection IV-AI deals with the computation of 
the likelihood and proposes a simple coordinatewise descent 
procedure. 



A. Hyperparameter likelihood 

The hyperparameter likelihood HLy can be deduced from 
the joint law for (iv, y) by frequency marginalization: 



HLy{r)= ^^[y,^i = ^- 



Pi 



but the indices run over states, so the above summation 
is not directly tractable. However, the Forward procedure effi- 
ciently achieves a recursive marginalization: it yields HLy{r) 
according to Eq. (fl4] i and requires about TP^ calculations. 

Let us introduce the Co-Log-HL (CLHL) to be minimized 
w.r.t. hyperparameters vector r: 

f"^ = £LTgmmCLHLy{r). 

r 

One possible optimization scheme is a coordinatewise descent 
algorithm, with a golden section line search [27]. But a more 
efficient scheme may be a gradient algorithm [27]. 



B. Likelihood gradient 

The EM algorithm relies on an auxiliary function, usually 
denoted Q [42,43] built on two hyperparameter vectors r and 
r' by completing the observed data set y with parameters to 
be marginalized u: 



g(r,r') = E^[log{Pv[u,y;r'] 



y;r 



= Y^logPi [u,y;r'] Pv[{u\y;r)]. 



With the proposed notations, usual hidden Markov chains 
calculations yield: 



Q{r,r') 



T P 
t=2 p,q=l 



(17) 



p 

E 

p=i 



T P 



p{p) logp'(p) logOtb) 

t=i p=i 



where: 



• (p',F',(D') and (p,P,(D) are parameters of the model 
under hyperparameters r' and r, respectively. 

• pt (p) and pt {p, q) denote the a posteriori marginal laws 
defined by (flST l and (fTST l. under hyperparameters r. 

The fcth iteration of the EM scheme maximizes 
Q{r'^^^'^\r') as a function of r', to yield r'*^) as the 
maximizer Unfortunately, it seems impossible to derive 
an explicit expression for such a maximizer However, an 
alternate route can be followed, given the key property: 



dQ{ry) 



dr' 



dCLHLy{r) 
dr 
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As suggested by [19], this property enables us to calculate the 
gradient of CLHLy{r) as the derivative of (fTTI i : 



T P 



t=i p=i 

T P 



(18) 



T P 



= Z^Z^ptfe?) — ■ (20) 



t=2 p,q=l 



dr' 



The encountered derivatives d log (D' (p) / dr'^, d log ©' (p) / 
and 9 log P' (p, g) / respectively read: 

^ r=l 

by derivation of (01 and ( fTST l. Finally the likelihood gradient 
is readily calculated and a gradient procedure can be appUed. 

VI. Simulation results and comparisons 

The previous Sections introduced a regularized method for 
frequency tracking and estimating hyperparameters. This Sec- 
tion demonstrates the practical effectiveness of the proposed 
approach by processing simulated signals shown in Fig. [l] 



The coordinate-wise descent algorithm performed well, 
since it does not require any gradient calculation. Gradient 
calculus needs much more computation than the likelihood 
itself, due to summations in Eq. (fT8]l-(l20ll. Likelihood calculus 
took 0.05 s, while gradient calculus required 0.2 s., i.e., about 
four times more. 

We have therefore adopted the two fastest methods: 
coordinate-wise and Polak-Ribiere pseudo-conjugate gradient, 
which took less than 3.5 s. Fig. [3] also illustrates the conver- 
gence. 

B. Frequency tracking 

The optimization procedure used to compute the MAP 
(given ML hyperparameters) consisted of applying the Viterbi 
algorithm (described in Section lTV-B.ll i. The solution was used 
as the starting point for the gradient or the Hessian procedure 
(described in Section llll-FI ). The Viterbi algorithm explored the 
whole set of possible frequencies (on a discrete grid) and found 
the correct interval for each frequency, while the gradient or 
Hessian procedure locally refined the optimum. Table HH shows 
the computation times. We adopted the Hessian procedure, 
since it performed almost 10 times faster. 



Method 


Time (s) 


MAP Viterbi 


0.13 


MAP Gradient 
MAP Hessian 


4.82 
0.51 



TABLE II 

Computation times comparison for frequency estimate. 



A. Hyperparameter estimation 

The hyperparameter likelihood function CLHL was first 
computed on a fine discrete grid of 25 x 25 x 25 values resulting 
in the level sets shown in Fig.|3] The function is fairly regular, 
and has a single minimum. 

The hyperparameters are tuned using two classes of descent 
algorithms: 

• a coordinate-wise descent algorithm 

• a gradient descent algorithm. 

The latter employs several descent directions: usual gradi- 
ent, bisector correction, Vignes correction and Polak-Ribiere 
pseudo-conjugate direction. Two line search methods have also 
been implemented: usual dichotomy and quadratic interpola- 
tion. The starting point remains the empirical hyperparameter 
vector described in Appendix |G] 

All the strategies provide the correct minimizer and they are 
compared TableUand Fig. [3] The usual gradient generated zig- 
zagging trajectories and was slower than the other strategies. 
The three corrected direction strategies were 25 to 40 % faster 
than the uncorrected ones, with the Polak-Ribiere pseudo- 
conjugate direction having a slight advantage. In contrast, 
interpolation did not result in any improvement within the 
corrected direction class. 

'Algorithms have been implemented using the computing environment 
Matlab on a Personal Computer, Pentium III, with a 450 MHz CPU and 
128 MB of RAM. 
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Fig. 4. Comparison of frequency profile estimates. From top to bottom: 
ML estimate {i.e., periodogram maximizer), unwrapped ML estimate, Viterbi- 
MAP estimate and Hessian-MAP estimate. 

Fig. |4] illustrates typical results. The ML strategy: 

- lacked robustness for two reasons: estimation was per- 
formed independently at each depth and N was small; 

- could not be corrected by an unwrap-like post-proces- 
sing since the ML solution was too rough (as already 
mentioned). 
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Fig. 3. Hyperparameter likelihood: typical behavior. Level sets of CLHL are plotted as dashed lines (--). The minima are located by a star (*), starting 
points (empirical estimates) by a dot (.) and final estimate by a circle (o). First row gives coordinate-wise algorithm and second row gives gradient algorithm. 
First column: CLH L{r^^^ , ri,,r,y), second column: CLH L{ra,r^^ ,r,j), third column: CLHL(ra,ri,,r^^). Each figure is logjQ-scaled. 



Method 


Reached minimum 


logio 


logic 'C" 


logio ^''^ 


Grad./Fun. 


Time (s) 


(la) 


4.513 10^ 


0.297 


-0.685 


-2.424 


17/59 


5.55 


(lb) 


4.495 10^ 


0.297 


-0.679 


-2.519 


13/87 


5.92 


(2a) 


4.494 10^ 


0.292 


-0.678 


-2.537 


9 /49 


3.77 


(2b) 


4.494 10^ 


0.299 


-0.681 


-2.554 


13/92 


6.14 


(3a) 


4.498 10^ 


0.297 


-0.695 


-2.589 


9/53 


4.07 


(3b) 


4.494 10^ 


0.298 


-0.679 


-2.547 


13/92 


6.21 


(4a) 


4.497 10^ 


0.283 


-0.674 


-2.507 


7 /40 


3.12 


(4b) 


4.500 10^ 


0.297 


-0.685 


-2.618 


9 /75 


4.84 


(5) 


4.495 10^ 


0.300 


-0.671 


-2.559 


/81 


3.41 



TABLE I 

Descent ALGORITHM comparison. The first column gives the method at work: (1) usual gradient, (2) Vignes correction, (3) 

BISECTOR correction AND (4) POLAK-RlBIERE PSEUDO-CONJUGATE DIRECTION, (a) NO INTERPOLATION AND (b) QUADRATIC INTERPOLATION. (5) IS 
THE COORDINATE- WISE DESCENT METHOD. THE FOLLOWING COLUMNS SHOW THE REACHED MINIMUM AND THE MINIMIZER. THE SIXTH COLUMN 
GIVES THE NUMBER OF GRADIENTS AND FUNCTION CALCULUS WHILE THE LAST GIVES COMPUTATION TIMES IN SECONDS (s). 



For the regularized solution (also given in Fig. |4|, a simple 
qualitative comparison with the reference led to three conclu- 
sions. 

- The estimated frequency sequence conformed much bet- 
ter to the true one. The frequency sequence was more 
regular, since smoothness was introduced as a prior 
feature. 

- The estimated frequency sequence remained close to 
the true one even beyond the usual Nyquist frequency. 
This was essentially due to the coherent accounting for 
the whole set of data and smoothness of the frequency 
sequence. 

- The proposed strategy for estimating hyperparameters 
is adequate. A variation of 0.1 of the hyperparameters 
resulted in an almost imperceptible variation in the esti- 



mated frequency sequence. This is especially important 
for qualifying the robustness of the proposed method: 
the choice of r offers relatively broad leeway and can be 
reliably made. 

VII. Conclusion and perspectives 

This paper examines the problem of frequency tracking 
beyond the Nyquist frequency as it occurs in Doppler imaging, 
when only short noisy data records are available. A solution 
is proposed in the Bayesian framework based upon hidden 
Gauss-Markov models accounting for prior smoothness of the 
frequency sequence. We have developed a computationally 
efficient combination of dynamic programming and a Hessian 
procedure to calculate the maximum a posteriori. The method 
is entirely unsupervised and uses an ML procedure based on an 
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original EM-based gradient procedure. The estimation of the 
ML hyperparameter is both formally achievable and practically 
useful. 

This new Bayesian method allows tracking beyond the usual 
Nyquist frequency, due to a coherent statistical framework that 
includes the whole set of data plus smoothness prior. To our 
knowledge, this capability is an original contribution to the 
field of frequency tracking. 

Future work may include the extension to Gaussian DSP [9], 
to multiple frequencies tracking [3, 17], and to the 2D problem. 
The latter and its connection to 2D phase unwrapping [44^6] 
is presently being investigated. 

Appendix 

A. Preliminary results 

This Section includes two useful results: for u G <C^ 

l + ittw (21) 

(In+u^u) = In-— — r- (22) 
^ ' 1 + u'u 

where In stands for the N x N identity matrix. 

B. Law for {vt\vt) 

Linearity of model ([U w.rt. amplitudes and assumptions 
for at and 6t, allow easy marginalization of {yt,at\i't)- yt\vt 
is clearly a zero-mean and Gaussian vector with covariance 
Rt = raz{i>t)z{vt)'^ + t^In- From relation (l2Tl l and (|22] | its 
determinant and inverse reads: 

i?r' = -In - ^z{vt)z{vt)^ (23) 



det \In + MM^] 



det Rt 



n N 



(24) 



C. Preliminary result 

The proposed proof is based on the decimal part function 
-0:11 — > [-0.5; +0.5[ defined by 

{d{x) = a; if a; e [-0.5; +0.5), 
1 13 is 1 -periodic, 

and the following straightforward properties 

D{x + k) = D{x), k(^% 
\D(x)\ ^ |a:| 
|D(a:)| s; 1/2 

y = -D(a;) ^ 3fc G Z such that y = a; + 



(25) 



(26) 
(27) 
(28) 
(29) 



D. Proof of proposition 

Let us define a frequency sequence v (with CLPL(y) < 
Qo) which does not verify Eq. (fT2] l of Proposition [T] i.e., 

3to G with \iyto+i - lytol > 1/2 . (30) 

Let us recursively build a new frequency sequence 

vi = vi (31) 
n+i = Vt + D{j^t+i'^t) for t^l,...,T -1(32) 



and prove that Eq. ( fT2b of Proposition [T] holds for v and that 
the criterion CLPL reduces from v to !>: 

\l^t+i-^t\ ^ 1/2 for iGN5._i, (33) 
CLPLiu) < CLPL{u). (34) 

• Relation (|33] | is straightforward: by Eq. ( l32b . one can see 

vt+i -vt^ D{vt+i - vt) for t G N5^_i 
and hence, by Property ( |28] ): 

\Vt+i-vt \ ^ 1/2 for t G N;,_i. 

• Proof of ( |34] | takes three steps, corresponding to each term 
of CLPL ([Toll. By Eq. ( |3T1l32] i and Property one can see 

3kt G TL such that vt ^ vt + h for t elM^ , (35) 

(with ki = 0) so, 

Ptiiyt) ^Pti^t) for teTM*r. (36) 

By Eq. (l32b and ( l35T l and invoking property (|26] | we have 

t't+i - i^t = D{vt+i - I't) = D{vt+i - Vt) 
hence, accounting for property dZTl i: 

\vt+i - vt\ ^ \i't+i - vtl ■ (37) 
Moreover, for t — Iq, we clearly have 

l^^to+l - i^tol < Wto+1 - Vto\ (38) 

thanks to hypothesis ( [30l ). Finally, we have: 

Ix(^i) = Ik(^^i) (39) 
Collecting (O, (O, m and ^ proves (O. 

£. The Yiterbi algorithm 



Pre-computations 

V{p,q) = AK-z/9)2 ip,q€Wp) 
C{p,t) = -Pt{vP) (pGNJ^iGN^.) 

Initialization (t = 1) 

Ci(p) = £(p,l)l?°(z.'') (pGN^) 
Iterations it ^2, ... ,T) 

Ct{p,q) = Cf_i(q) + {p,qe 
Ctip) = mingCt(p^9) (p G 

Vt {p) = arg min^ Ct (p, g) (p G NJ 

Termination (t = T) 



Pt 



arg min Ct (p) 

p 



Back tracking (t = T - 1, . . . , 1) 

Ft = 'Pt{pt+i) 



Wp) 
) 
) 
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F. The Forward algorithm 
« Initialization (t = 1) 

p 

q=l 

• Iterations (t = 2,...,T) 

p 

Mp) = ©t(p)^^t_ib)F(g,p) (p&Wp) 

9=1 

P 
q=l 

Tt{p) - Tt(p)INt {p^Wp) 



G. The Backward algorithm 



• Initialization (t = T) 






Brip) = 1 
St(p) = 1 






• Iterations (i T - 1, . . . , 1) 






p 

Bt{p) = 5]©t+i(g)Bt+i(p)P(p,g) 




Bt{p) = Btip)/Mt+i 




iP e N^) 



This Section is devoted to the empirical estimation of 
hyperparameters used as a starting point in the maximization 
procedures of Section IVI-AI These estimates are based on 
the correlation r(n) of ytji't and easily shown to verify: 
^■(0) — ra +rb , and |r(l)| = r^, for all t e N^-. Empirical 
estimates r(0) and are computed from the whole data set 
y and remain robust, since T is large (even if N is small). 
Finally, one can compute fa = |f(l)| ,andf;, — r{0) ~ |f(l)|. 

For r^, the estimation is based on the ML estimate of 
the frequency sequence in each range bin t G N^. The 
proposed empirical estimate of is naturally the empirical 
variance of the differences between the ML frequencies. This 
procedure yields an overestimated value for r^. This result is 
expected, since the sequence of ML frequencies varies greatly 
and has discontinuities, as mentioned above. Nevertheless, 
this estimate is a suitable starting point for the maximization 
procedures of Section IVI-AI 
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